Training Bayesian neural networks with measure optimisation algorithms
Sergei Zuyev (Chalmers)
Abstract: At a high level of abstraction, a Bayesian neural network (BNN) can be seen as a function of the input data and their prior probability distribution which yields, among other outputs, an estimated posterior probability distribution. This posterior is the result of optimising a chosen score function that favours those probability distributions which best describe the observed data while taking the prior distribution into account.
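For concreteness, one common instance of such a score (a typical choice in variational Bayesian inference, given here purely as an illustration rather than as the score used in the talk) is the negative evidence lower bound, which trades off fit to the observed data against divergence from the prior:

```latex
% Illustrative score over candidate posteriors q (not necessarily the one used in the talk):
% expected negative log-likelihood of the data D plus the KL divergence to the prior p(theta)
\mathcal{S}(q) \;=\; -\,\mathbb{E}_{q(\theta)}\!\left[\log p(\mathcal{D}\mid\theta)\right]
\;+\; \mathrm{KL}\!\left(q(\theta)\,\middle\|\,p(\theta)\right)
```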
Instead of constrained optimisation over the simplex of probability distributions, it is common to map this simplex into Euclidean space, for example with the softmax function or one of its variants, and then to optimise over the whole space without constraints. It is, however, widely acknowledged that such a mapping often has undesirable consequences for the optimisation and for the stability of the algorithms. To counteract this, several regularisation procedures have been proposed in the literature.
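As an illustration of the standard reparametrisation, the following minimal sketch (our own hypothetical Python example, not code from the talk) optimises a simple score over unconstrained logits z, which are mapped onto the probability simplex by softmax:

```python
import numpy as np

def softmax(z):
    # Map unconstrained logits z in R^n to a point on the probability simplex.
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def score(p, counts, prior):
    # Illustrative score: negative log-likelihood of observed category counts
    # plus a KL penalty towards a prior distribution (hypothetical choice).
    eps = 1e-12
    nll = -np.sum(counts * np.log(p + eps))
    kl = np.sum(p * (np.log(p + eps) - np.log(prior + eps)))
    return nll + kl

# Unconstrained gradient descent on the logits, with finite-difference gradients
# (kept deliberately simple; a real implementation would use autodiff).
counts = np.array([30., 10., 5.])
prior = np.array([1/3, 1/3, 1/3])
z = np.zeros(3)
h = 1e-6
for _ in range(500):
    f0 = score(softmax(z), counts, prior)
    grad = np.zeros_like(z)
    for i in range(len(z)):
        z_h = z.copy(); z_h[i] += h
        grad[i] = (score(softmax(z_h), counts, prior) - f0) / h
    z -= 0.05 * grad

print(softmax(z))   # approximate minimiser, reached via the softmax mapping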
Instead of modifying the mapping approach, we suggest returning to optimisation on the original simplex, using recently developed algorithms for constrained optimisation of functionals of measures. We demonstrate that our algorithms run tens of times faster than the standard algorithms based on the softmax mapping and yield exact solutions rather than approximations.
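For contrast, the sketch below (again our own illustrative Python, and explicitly not the measure-optimisation algorithms of the talk) minimises the same score directly on the simplex with a classical Frank-Wolfe / conditional-gradient step, one standard way to handle the simplex constraint without any softmax mapping: each step moves mass towards the vertex (Dirac measure) where the derivative of the score is smallest.

```python
import numpy as np

def score_grad(p, counts, prior):
    # Gradient of the illustrative score from the previous sketch
    # (negative log-likelihood plus a KL penalty towards the prior).
    eps = 1e-12
    return -counts / (p + eps) + np.log((p + eps) / (prior + eps)) + 1.0

counts = np.array([30., 10., 5.])
prior = np.array([1/3, 1/3, 1/3])
p = np.full(3, 1/3)                 # start at the barycentre of the simplex

for t in range(500):
    g = score_grad(p, counts, prior)
    # Linear minimisation oracle over the simplex: the vertex (Dirac measure)
    # at the coordinate with the smallest partial derivative.
    s = np.zeros_like(p)
    s[np.argmin(g)] = 1.0
    gamma = 2.0 / (t + 2.0)         # standard Frank-Wolfe step size
    p = (1.0 - gamma) * p + gamma * s   # the iterate stays on the simplex

print(p)   # approximate constrained minimiser, no softmax reparametrisation
```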
Topics: machine learning, probability, statistics theory
Audience: researchers in the discipline
Series comments: The Gothenburg statistics seminar is open to the interested public; everybody is welcome. It usually takes place in MVL14 (http://maps.chalmers.se/#05137ad7-4d34-45e2-9d14-7f970517e2b60; see the specific talk). Speakers are asked to prepare material for 35 minutes, excluding questions from the audience.
Organizers: Akash Sharma*, Helga Kristín Ólafsdóttir*
*contact for this listing
